We propose a new differentially-private decision forest algorithm that minimizes both the number of queries required and the sensitivity of those queries. To do so, we build an ensemble of random decision trees that avoids querying the private data except to find the majority class label in the leaf nodes. Rather than using a count query to return the class counts like the current state-of-the-art, we use the Exponential Mechanism to only output the class label itself. This drastically reduces the sensitivity of the query -- often by several orders of magnitude -- which in turn reduces the amount of noise that must be added to preserve privacy. Our improved sensitivity is achieved by using "smooth sensitivity", which takes into account the specific data used in the query rather than assuming the worst-case scenario. We also extend work done on the optimal depth of random decision trees to handle continuous features, not just discrete features. This, along with several other improvements, allows us to create a differentially private decision forest with substantially higher predictive power than the current state-of-the-art.
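To make the core idea concrete, here is a minimal sketch (not the paper's implementation) of using the Exponential Mechanism to privately select a leaf's majority class label. The utility of a label is its count in the leaf, which has sensitivity 1 (adding or removing one record changes one count by one); the function name and parameters are illustrative assumptions.

```python
# Hedged sketch: Exponential Mechanism for leaf-label selection.
# Utility u(D, label) = count of that label in the leaf; sensitivity = 1,
# since one record changes exactly one class count by exactly one.
import math
import random

def exp_mech_label(class_counts, epsilon, sensitivity=1.0):
    """Sample one class label with probability proportional to
    exp(epsilon * count / (2 * sensitivity))."""
    labels = list(class_counts)
    # Subtract the max count before exponentiating for numerical stability;
    # this shifts all utilities equally and leaves the distribution unchanged.
    max_count = max(class_counts.values())
    weights = [
        math.exp(epsilon * (class_counts[lbl] - max_count) / (2 * sensitivity))
        for lbl in labels
    ]
    # Sample a label according to the (unnormalized) weights.
    r = random.uniform(0.0, sum(weights))
    acc = 0.0
    for lbl, w in zip(labels, weights):
        acc += w
        if r <= acc:
            return lbl
    return labels[-1]

# Example: a leaf with a clear majority returns "A" with high probability
# for moderate epsilon, yet the output is a single label, not noisy counts.
leaf = {"A": 40, "B": 3, "C": 2}
chosen = exp_mech_label(leaf, epsilon=2.0)
print(chosen)
```

Because only a label is released, the mechanism's sensitivity stays small regardless of how large the counts are, which is the source of the noise reduction the abstract describes; the smooth-sensitivity refinement in the paper tightens this further using the actual data, which this sketch does not attempt.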